LGen - A Lattice-Based Candidate Set Generation Algorithm for I/O Efficient Association Rule Mining

Authors

  • Chi Lap Yip
  • K. K. Loo
  • Ben Kao
  • David Wai-Lok Cheung
  • C. K. Cheng
Abstract

Most algorithms for association rule mining are variants of the basic Apriori algorithm. One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the largest large itemsets. In this paper, we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We show that, given a reasonable set of suggested large itemsets, LGen can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the large itemsets, irrespective of the size of the largest ones.
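
To make the scan-count argument concrete, the following is a minimal sketch of the baseline level-wise Apriori procedure that the abstract contrasts LGen with. It is not LGen itself, whose details are not given here. The function name `apriori`, the absolute-count `min_support` parameter, and the returned scan counter are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Illustrative level-wise Apriori (hypothetical helper, not LGen).

    Returns (large itemsets mapped to their support counts, number of scans).
    min_support is assumed to be an absolute transaction count.
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0
    large = {}
    # Round 1 candidates: every single item (item universe assumed known).
    candidates = {frozenset([item]) for t in transactions for item in t}
    k = 1
    while candidates:
        # One full database scan counts all candidates of size k.
        scans += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        large.update(current)
        # Join large k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has a k-subset which is not large.
        k += 1
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return large, scans


db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
large_itemsets, num_scans = apriori(db, min_support=2)
print(num_scans)  # 3: one scan per itemset size up to the largest large itemset
```

In this toy database the largest large itemset has three items, so the level-wise loop needs three scans. An approach that counts candidates of several sizes in one scan, as the abstract describes for LGen, would aim to reduce that number.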

Similar resources

A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining

Data envelopment analysis (DEA) is a relatively new data oriented approach to evaluate performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relative limited period, DEA has been converted into a strong quantitative and analytical tool to measure and evaluate performance. In an article written by Toloo et al. (2009...

Association Rule Mining based on Apriori Algorithm in Minimizing Candidate Generation

Association Rule Mining is an area of data mining that focuses on pruning candidate keys. An Apriori algorithm is the most commonly used Association Rule Mining. This algorithm somehow has limitation and thus, giving the opportunity to do this research. This paper introduces a new way in which the Apriori algorithm can be improved. The modified algorithm introduces factors such as set size and ...

Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern Tree

Association Rule is an important tool for today data mining technique. But this work only concern with positive rule generation till now. This paper gives study for generating negative and positive rule generation as demand of modern data mining techniques requirements. Here also gives detail of “A method for generating all positive and negative Association Rules” (PNAR). PNAR help to generates...

Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern

Association Rule is an important tool for today data mining technique. But this work only concern with positive rule generation till now. This paper gives study for generating negative and positive rule generation as demand of modern data mining techniques requirements. Here also gives detail of “A method for generating all positive and negative Association Rules” (PNAR). PNAR help to generates...

An Algorithm for Finding Frequent Itemset based on Lattice Approach for Lower Cardinality Dense and Sparse Dataset

Whenever mining association rules work for large data sets frequently itemset always play an important role and enhance the performance. Apriori algorithm is widely used for mining association rule which uses frequent item set but its performance can be improved by enhancing the performance of frequent itemsets. This paper proposes a new novel approach to finding frequent itemsets. The approach...

Publication date: 1999